Multi-Reference Evaluation for Dialectal Speech Recognition System: A Study for Egyptian ASR
نویسندگان
چکیده
Dialectal Arabic has no standard orthographic representation. This creates a challenge when evaluating an Automatic Speech Recognition (ASR) system for dialect. Since the reference transcription text can vary widely from one user to another, we propose an innovative approach for evaluating dialectal speech recognition using Multi-References. For each recognized speech segments, we ask five different users to transcribe the speech. We combine the alignment for the multiple references, and use the combined alignment to report a modified version of Word Error Rate (WER). This approach is in favor of accepting a recognized word if any of the references typed it in the same form. Our method proved to be more effective in capturing many correctly recognized words that have multiple acceptable spellings. The initial WER according to each of the five references individually ranged between 76.4% to 80.9%. When considering all references combined, the Multi-References MR-WER was found to be 53%.
منابع مشابه
Advances in Dialectal Arabic Speech Recognition: A Study Using Twitter to Improve Egyptian ASR
This paper reports results in building an Egyptian Arabic speech recognition system as an example for under-resourced languages. We investigated different approaches to build the system using 10 hours for training the acoustic model, and results for both grapheme system and phoneme system using MADA. The phoneme-based system shows better results than the grapheme-based system. In this paper, we...
متن کاملCollecting Data for Automatic Speech Recognition Systems in Dialectal Arabic Using Games with a Purpose
Building Automatic Speech Recognition (ASR) systems for spoken languages usually suffer from the problem of limited available transcriptions. Automatic Speech Recognition (ASR) systems require large speech corpora that contain speech and their corresponding transcriptions for training acoustic models. In this paper, we target the Egyptian dialectal Arabic. As other spoken languages, it is mainl...
متن کاملMultilingual speech recognition: the 1996 byblos callhome system
This paper describes the 1996 Byblos Callhome speech recognition system for Spanish and Egyptian Colloquial Arabic. The system uses a combination of Phoneticly Tied-Mixture Gaussian HMMs and State-Clustered Tied-Mixture Gaussian HMMs in a multiple pass decoder. We focus here on the aspects of the system which are language specific and demonstrate the adaptability of the Byblos English system to...
متن کاملAnalysis of Dialectal Influence in Pan-Arabic ASR
In this paper, we analyze the impact of five Arabic dialects on the front-end and pronunciation dictionary components of an Automatic Speech Recognition (ASR) system. We use ASR’s phonetic decision tree as a diagnostic tool to compare the robustness of MFCC and MLP front-ends to dialectal variations in the speech data and found that MLP Bottle-Neck features are less robust to such variations. W...
متن کاملAutomatic speech recognition for an under-resourced language - amharic
In this paper we present the development of an Automatic Speech Recognition System (ASRS) for Amharic using limited available resources and the freely available speech toolkit (HTK). There are phonological, dialectal, orthographic and morphological features of Amharic that challenge the development of ASRSs. The problem of resource scarcity is also a hindrance to the research and development in...
متن کامل